    TCBERT: A Technical Report for Chinese Topic Classification BERT

    Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2019), has been one of the base models for various NLP tasks thanks to its remarkable performance. Variants customized for different languages and tasks have been proposed to further improve performance. In this work, we investigate supervised continued pre-training (Gururangan et al., 2020) of BERT for the Chinese topic classification task. Specifically, we incorporate prompt-based learning and contrastive learning into the pre-training. To adapt to Chinese topic classification, we collect around 2.1M Chinese samples spanning various topics. The pre-trained Chinese Topic Classification BERTs (TCBERTs) with different parameter sizes are open-sourced at https://huggingface.co/IDEA-CCNL.
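
    Below is a minimal sketch of the prompt-based classification idea described above, using a masked-LM head to score candidate topic labels. It is written against the Hugging Face transformers API; the checkpoint name, prompt template, and label set are illustrative assumptions, not the paper's exact setup.

    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    # Checkpoint name is an assumed example; see https://huggingface.co/IDEA-CCNL
    # for the TCBERT sizes actually released.
    name = "IDEA-CCNL/Erlangshen-TCBert-110M-Classification-Chinese"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name).eval()

    labels = ["体育", "财经"]  # candidate topics: sports, finance
    text = "昨晚的比赛中主队以三比一获胜。"  # "The home team won 3-1 last night."
    # Two [MASK] slots for a two-character topic label (template is an assumption).
    prompt = f"这是一篇关于{tok.mask_token * 2}的新闻:{text}"

    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        log_probs = model(**inputs).logits.log_softmax(-1)
    mask_pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero(as_tuple=True)[0]

    def score(label: str) -> float:
        # Sum the log-probabilities of the label's characters at the mask positions.
        ids = tok.convert_tokens_to_ids(list(label))
        return sum(log_probs[0, p, i].item() for p, i in zip(mask_pos, ids))

    print(max(labels, key=score))  # highest-scoring topic for `text`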

    CMB: A Comprehensive Medical Benchmark in Chinese

    Large Language Models (LLMs) offer the possibility of a major breakthrough in medicine, and a standardized medical benchmark is a fundamental cornerstone for measuring progress. However, medical environments in different regions have their own local characteristics, e.g., the ubiquity and significance of traditional Chinese medicine within China. Therefore, merely translating English-based medical evaluations may introduce contextual incongruities in a local region. To address this issue, we propose a localized medical benchmark called CMB, a Comprehensive Medical Benchmark in Chinese, designed and rooted entirely within the native Chinese linguistic and cultural framework. While traditional Chinese medicine is integral to this evaluation, it does not constitute its entirety. Using this benchmark, we evaluate several prominent LLMs, including ChatGPT, GPT-4, dedicated Chinese LLMs, and LLMs specialized in the medical domain. It is worth noting that our benchmark is devised not as a leaderboard competition but as an instrument for self-assessment of model advancements. We hope this benchmark can facilitate the widespread adoption and enhancement of medical LLMs within China. Details are at https://cmedbenchmark.llmzoo.com/.
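
    As a concrete illustration of how a multiple-choice medical benchmark of this kind might be scored, here is a minimal sketch; the item schema and the answer-extraction heuristic are assumptions, not CMB's published evaluation code.

    import re

    def extract_choice(reply: str):
        # Take the first option letter appearing in the model's free-text reply.
        # This is a common but lossy heuristic, assumed here for illustration.
        m = re.search(r"[A-E]", reply)
        return m.group(0) if m else None

    def accuracy(items, generate) -> float:
        # items: dicts with "question", "options", "answer"; generate: a model call
        # returning free text. Both are hypothetical interfaces for this sketch.
        correct = sum(
            extract_choice(generate(it["question"], it["options"])) == it["answer"]
            for it in items
        )
        return correct / len(items)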

    AceGPT, Localizing Large Language Models in Arabic

    This paper explores the imperative need for, and a methodology for developing, a localized Large Language Model (LLM) tailored for Arabic, a language with unique cultural characteristics that are not adequately addressed by current mainstream models like ChatGPT. Additional concerns arise around cultural sensitivity and local values. To this end, the paper outlines a packaged solution: further pre-training on Arabic texts, supervised fine-tuning (SFT) using native Arabic instructions with GPT-4 responses in Arabic, and reinforcement learning with AI feedback (RLAIF) using a reward model that is sensitive to local culture and values. The objective is to train culturally aware and value-aligned Arabic LLMs that can serve the diverse, application-specific needs of Arabic-speaking communities. Extensive evaluations demonstrate that the resulting LLM, 'AceGPT', is the state-of-the-art open Arabic LLM on various benchmarks, including instruction-following benchmarks (Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmarks (Arabic MMLU and EXAMs), and a newly proposed Arabic cultural and value alignment benchmark. Notably, AceGPT outperforms ChatGPT on the popular Vicuna-80 benchmark when evaluated with GPT-4, despite the benchmark's limited scale. Code, data, and models are available at https://github.com/FreedomIntelligence/AceGPT.
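
    The RLAIF step is easiest to picture as a reward model ranking candidate responses. The sketch below shows only that scoring step; the checkpoint name is a hypothetical placeholder, and the paper's actual reward model and RL algorithm are not reproduced here.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    name = "my-org/arabic-culture-reward-model"  # hypothetical checkpoint name
    tok = AutoTokenizer.from_pretrained(name)
    rm = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1).eval()

    def reward(prompt: str, response: str) -> float:
        # Score a (prompt, response) pair; higher means better aligned with the
        # culture- and value-sensitive preferences the reward model was trained on.
        inputs = tok(prompt, response, return_tensors="pt", truncation=True)
        with torch.no_grad():
            return rm(**inputs).logits[0, 0].item()

    # Rank candidate responses sampled from the policy model (best-of-n style).
    prompt = "ما هي آداب الضيافة؟"  # "What are the customs of hospitality?"
    candidates = ["...", "..."]     # placeholder samples from the policy
    best = max(candidates, key=lambda r: reward(prompt, r))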

    Assessment of treatment response during chemoradiation therapy for pancreatic cancer based on quantitative radiomic analysis of daily CTs: An exploratory study.

    In an effort toward early assessment of treatment response, we investigated radiation-induced changes in quantitative CT features of the tumor during the delivery of chemoradiation therapy (CRT) for pancreatic cancer.

    Diagnostic-quality CT data acquired daily during routine CT-guided CRT using a CT-on-rails system for 20 pancreatic head cancer patients were analyzed. On each daily CT, the pancreatic head, the spinal cord, and the aorta were delineated, and the histograms of CT number (CTN) within these contours were extracted. Eight histogram-based radiomic metrics, including the mean CTN (MCTN), peak position, volume, standard deviation (SD), skewness, kurtosis, energy, and entropy, were calculated for each fraction. Paired t-tests were used to assess the significance of the change in each metric at each time point. A generalized estimating equation (GEE) model was used to test the association between changes in the metrics over time and pathological response.

    In general, the CTN histogram in the pancreatic head (but not in the spinal cord) changed during CRT delivery. Changes from the 1st to the 26th fraction in MCTN ranged from -15.8 to 3.9 HU, with an average of -4.7 HU (p < 0.001). Meanwhile, the volume decreased, the skewness increased (less skewed), and the kurtosis decreased (less peaked). The changes in MCTN, volume, skewness, and kurtosis became significant after two weeks of treatment. Pathological response was associated with changes in MCTN, SD, and skewness: in cases of good response, patients tended to have large reductions in MCTN and skewness and large increases in SD and kurtosis.

    Significant changes in CT radiomic features, such as MCTN, skewness, and kurtosis of the tumor, were observed during the course of CRT for pancreatic cancer based on quantitative analysis of daily CTs. These changes may potentially be used for early assessment of treatment response and stratification for therapeutic intensification.
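
    For reference, the eight histogram-based metrics named above can be computed from the CT numbers (HU) of the voxels inside a delineated contour roughly as follows. The binning choice and the exact definitions of peak position, energy, and entropy are assumptions, since the study's implementation details are not given here.

    import numpy as np
    from scipy import stats

    def histogram_metrics(ctn: np.ndarray, bins: int = 64) -> dict:
        # ctn: 1-D array of CT numbers (HU) for voxels inside one contour.
        hist, edges = np.histogram(ctn, bins=bins)
        p = hist / hist.sum()          # normalized histogram
        nz = p[p > 0]                  # drop empty bins before taking logs
        return {
            "mean_ctn": ctn.mean(),                    # MCTN
            "peak_position": edges[np.argmax(hist)],   # left edge of modal bin (HU)
            "volume": ctn.size,                        # voxel count; scale by voxel
                                                       # volume for physical units
            "sd": ctn.std(ddof=1),
            "skewness": stats.skew(ctn, axis=None),
            "kurtosis": stats.kurtosis(ctn, axis=None),
            "energy": float(np.sum(p ** 2)),           # histogram uniformity
            "entropy": float(-np.sum(nz * np.log2(nz))),  # Shannon entropy (bits)
        }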

    Comparisons of the average changes for the good- and poor-response groups.

    (a) Histogram moments, including MCTN, SD, skewness, and kurtosis, and (b) volume of the GTV, measured from the first RT fraction over the course of CRT. Error bars show the standard error across the cohort.